AITopics | model checking

Collaborating Authors

model checking

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Neural Model Checking

Neural Information Processing SystemsMar-21-2026, 19:18:12 GMT

We introduce a machine learning approach to model checking temporal logic, with application to formal hardware verification. Model checking answers the question of whether every execution of a given system satisfies a desired temporal logic specification. Unlike testing, model checking provides formal guarantees. Its application is expected standard in silicon design and the EDA industry has invested decades into the development of performant symbolic model checking algorithms. Our new approach combines machine learning and symbolic reasoning by using neural networks as formal proof certificates for linear temporal logic. We train our neural certificates from randomly generated executions of the system and we then symbolically check their validity using satisfiability solving which, upon the affirmative answer, establishes that the system provably satisfies the specification. We leverage the expressive power of neural networks to represent proof certificates as well as the fact that checking a certificate is much simpler than finding one. As a result, our machine learning procedure for model checking is entirely unsupervised, formally sound, and practically effective. We experimentally demonstrate that our method outperforms the state-of-the-art academic and commercial model checkers on a set of standard hardware designs written in SystemVerilog.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

9d0947107ea92d6ce369dce7749180dd-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 23:27:29 GMT

artificial intelligence, logic & formal reasoning, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.96)

Add feedback

Formal that "Floats" High: Formal Verification of Floating Point Arithmetic

Mohanty, Hansa, Viswambharan, Vaisakh Naduvodi, Gadde, Deepak Narayan

arXiv.org Artificial IntelligenceDec-9-2025

Formal verification of floating-point arithmetic remains challenging due to non-linear arithmetic behavior and the tight coupling between control and datapath logic. Existing approaches often rely on high-level C models for equivalence checking against Register Transfer Level (RTL) designs, but this introduces abstraction gaps, translation overhead, and limits scalability at the RTL level. To address these challenges, this paper presents a scalable methodology for verifying floating-point arithmetic using direct RTL-to-RTL model checking against a golden reference model. The approach adopts a divide-and conquer strategy that decomposes verification into modular stages, each captured by helper assertions and lemmas that collectively prove a main correctness theorem. Counterexample (CEX)-guided refinement is used to iteratively localize and resolve implementation defects, while targeted fault injection validates the robustness of the verification process against precision-critical datapath errors. To assess scalability and practicality, the methodology is extended with agentic AI-based formal property generation, integrating large language model (LLM)-driven automation with Human-in-the-Loop (HITL) refinement. Coverage analysis evaluates the effectiveness of the approach by comparing handwritten and AI-generated properties in both RTL-to-RTL model checking and standalone RTL verification settings. Results show that direct RTL-to-RTL model checking achieves higher coverage efficiency and requires fewer assertions than standalone verification, especially when combined with AI-generated properties refined through HITL guidance.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.0685

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Add feedback

Formal Verification of Probabilistic Multi-Agent Systems for Ballistic Rocket Flight Using Probabilistic Alternating-Time Temporal Logic

Kurpiewski, Damian, Michalczyk, Jędrzej, Jamroga, Wojciech, Michalski, Jerzy Julian, Sidoruk, Teofil

arXiv.org Artificial IntelligenceDec-1-2025

This technical report presents a comprehensive formal verification approach for probabilistic agent systems modeling ballistic rocket flight trajectories using Probabilistic Alternating-Time Temporal Logic (PATL). We describe an innovative verification framework specifically designed for analyzing critical safety properties of ballistic rockets engineered to achieve microgravity conditions for scientific experimentation. Our model integrates authentic flight telemetry data encompassing velocity vectors, pitch angles, attitude parameters, and GPS coordinates to construct probabilistic state transition systems that rigorously account for environmental stochasticity, particularly meteorological variability. We formalize mission-critical safety properties through PATL specifications to systematically identify trajectory deviation states where the rocket risks landing in prohibited or hazardous zones. The verification framework facilitates real-time safety monitoring and enables automated intervention mechanisms, including emergency engine disengagement protocols, when predefined safety thresholds are exceeded. Experimental validation demonstrates the practical effectiveness and reliability of our approach in ensuring mission safety while maintaining scientific mission objectives.

artificial intelligence, time precision, verification result, (14 more...)

arXiv.org Artificial Intelligence

2511.22572

Genre: Research Report (0.64)

Industry:

Transportation > Air (1.00)
Health & Medicine (1.00)
Aerospace & Defense (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Bridging LLM Planning Agents and Formal Methods: A Case Study in Plan Verification

Ramani, Keshav, Tawosi, Vali, Alamir, Salwa, Borrajo, Daniel

arXiv.org Artificial IntelligenceNov-25-2025

We introduce a novel framework for evaluating the alignment between natural language plans and their expected behavior by converting them into Kripke structures and Linear Temporal Logic (LTL) using Large Language Models (LLMs) and performing model checking. We systematically evaluate this framework on a simplified version of the PlanBench plan verification dataset and report on metrics like Accuracy, Precision, Recall and F1 scores. Our experiments demonstrate that GPT-5 achieves excellent classification performance (F1 score of 96.3%) while almost always producing syntactically perfect formal representations that can act as guarantees. However, the synthesis of semantically perfect formal models remains an area for future exploration.

large language model, logic & formal reasoning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.03469

Country: Europe (0.47)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

ATL*AS: An Automata-Theoretic Approach and Tool for the Verification of Strategic Abilities in Multi-Agent Systems

Garcia-Alcalde, Sofia Garcia de Blas, Belardinelli, Francesco

arXiv.org Artificial IntelligenceOct-22-2025

We present two novel symbolic algorithms for model checking the Alternating-time Temporal Logic ATL*, over both the infinite-trace and the finite-trace semantics. In particular, for infinite traces we design a novel symbolic reduction to parity games. We implement both methods in the ATL*AS model checker and evaluate it using synthetic benchmarks as well as a cybersecurity scenario. Our results demonstrate that the symbolic approach significantly outperforms the explicit-state representation and we find that our parity-game-based algorithm offers a more scalable and efficient solution for infinite-trace verification, outperforming previously available tools. Our results also confirm that finite-trace model checking yields substantial performance benefits over infinite-trace verification. As such, we provide a comprehensive toolset for verifying multiagent systems against specifications in ATL*.

algorithm, artificial intelligence, parity game, (16 more...)

arXiv.org Artificial Intelligence

2510.17306

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Neural Model Checking Mirco Giacobbe University of Birmingham, UK Daniel Kroening

Neural Information Processing SystemsOct-10-2025, 11:20:51 GMT

We introduce a machine learning approach to model checking temporal logic, with application to formal hardware verification.

model checking, oot, springer, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > West Midlands > Birmingham (0.40)
North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Verifying Memoryless Sequential Decision-making of Large Language Models

Gross, Dennis, Spieker, Helge, Gotlieb, Arnaud

arXiv.org Artificial IntelligenceOct-9-2025

We introduce a tool for rigorous and automated verification of large language model (LLM)- based policies in memoryless sequential decision-making tasks. Given a Markov decision process (MDP) representing the sequential decision-making task, an LLM policy, and a safety requirement expressed as a PCTL formula, our approach incrementally constructs only the reachable portion of the MDP guided by the LLM's chosen actions. Each state is encoded as a natural language prompt, the LLM's response is parsed into an action, and reachable successor states by the policy are expanded. The resulting formal model is checked with Storm to determine whether the policy satisfies the specified safety property. In experiments on standard grid world benchmarks, we show that open source LLMs accessed via Ollama can be verified when deterministically seeded, but generally underperform deep reinforcement learning baselines. Our tool natively integrates with Ollama and supports PRISM-specified tasks, enabling continuous benchmarking in user-specified sequential decision-making tasks and laying a practical foundation for formally verifying increasingly capable LLMs.

large language model, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2510.06756

Genre: Research Report > New Finding (0.46)

Industry: Transportation (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Learning Representations Through Contrastive Neural Model Checking

Krsmanovic, Vladimir, Cosler, Matthias, Ghanem, Mohamed, Finkbeiner, Bernd

arXiv.org Artificial IntelligenceOct-7-2025

Model checking is a key technique for verifying safety-critical systems against formal specifications, where recent applications of deep learning have shown promise. However, while ubiquitous for vision and language domains, representation learning remains underexplored in formal verification. We introduce Contrastive Neural Model Checking (CNML), a novel method that leverages the model checking task as a guiding signal for learning aligned representations. CNML jointly embeds logical specifications and systems into a shared latent space through a self-supervised contrastive objective. On industry-inspired retrieval tasks, CNML considerably outperforms both algorithmic and neural baselines in cross-modal and intra-modal settings. We further show that the learned representations effectively transfer to downstream tasks and generalize to more complex formulas. These findings demonstrate that model checking can serve as an objective for learning representations for formal languages.

logic & formal reasoning, machine learning, specification, (17 more...)

arXiv.org Artificial Intelligence

2510.01853

Country:

Europe (1.00)
Asia (0.68)
North America > United States > California (0.67)

Genre:

Instructional Material (0.94)
Research Report > New Finding (0.48)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

AgentGuard: Runtime Verification of AI Agents

Koohestani, Roham

arXiv.org Artificial IntelligenceSep-30-2025

The rapid evolution to autonomous, agentic AI systems introduces significant risks due to their inherent unpredictability and emergent behaviors; this also renders traditional verification methods inadequate and necessitates a shift towards probabilistic guarantees where the question is no longer if a system will fail, but the probability of its failure within given constraints. This paper presents AgentGuard, a framework for runtime verification of Agentic AI systems that provides continuous, quantitative assurance through a new paradigm called Dynamic Probabilistic Assurance. AgentGuard operates as an inspection layer that observes an agent's raw I/O and abstracts it into formal events corresponding to transitions in a state model. It then uses online learning to dynamically build and update a Markov Decision Process (MDP) that formally models the agent's emergent behavior. Using probabilistic model checking, the framework then verifies quantitative properties in real-time.

agent, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.23864

Country: Europe (0.68)

Genre: Research Report (0.64)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback